Speaking style adaptation for spontaneous speech recognition using multiple-regression HMM
نویسندگان
چکیده
This paper describes a rapid model adaptation technique for spontaneous speech recognition. The proposed technique utilizes a multiple-regression hidden Markov model (MRHMM) and is based on a style estimation technique of speech. In the MRHMM, the mean vector of probability density function (pdf) is given by a function of a low-dimensional vector, called style vector, which corresponds to the intensity of expressivity of speaking style variation. The value of the style vector is estimated for every utterance of the input speech and the model adaptation is conducted by calculating new mean vectors of the pdf using the estimated style vector. The performance evaluation results using “Corpus of spontaneous Japanese (CSJ)” are shown under a condition in which the amount of model training and adaptation data is very small.
منابع مشابه
An on-line adaptation technique for emotional speech recognition using style estimation with multiple-regression HMM
This paper describes a model adaptation technique for emotional speech recognition based on multiple-regression HMM (MR-HMM). We use a low-dimensional vector called style vector which corresponds the degree of expressivity of emotional speech as the explanatory variable of the regression. In the proposed technique, first, the value of the style vector for input speech is estimated. Then, using ...
متن کاملRecent Development of HMM-Based Expressive Speech Synthesis and Its Applications
This paper describes the recent development of HMM-based expressive speech synthesis. Although the expressive speech includes a wide variety of expressions such as emotions, speaking styles, intention, attitude, emphasis, focus, and so on, we mainly refer to the speech synthesis techniques for emotions and speaking styles, which would be the most primary expressions in human speech communicatio...
متن کاملA style control technique for speech synthesis using multiple regression HSMM
This paper presents a technique for controlling intuitively the degree or intensity of speaking styles and emotional expressions of synthetic speech. The conventional style control technique based on multiple regression HMM (MRHMM) has a problem that it is difficult to control phone duration of synthetic speech because HMM has no explicit parameter which models phone duration appropriately. To ...
متن کاملAffect-Robust Speech Recognition by Dynamic Emotional Adaptation
Automatic Speech Recognition fails to a certain extent when confronted with highly affective speech. In order to cope with this problem we suggest dynamic adaptation to the actual user emotion. The ASR framework is built by a hybrid ANN/HMM mono-phone 5k bi-gram LM recognizer. Based hereon we show adaptation to the affective speaking style. Speech emotion recognition takes place prior to the ac...
متن کاملAn optimized multi-duration HMM for spontaneous speech recognition
In spontaneous speech, various speech style and speed changes can be observed, which are known to degrade speech recognition accuracy. In this paper, we describe an optimized multi-duration HMM (OMD). An OMD is a kind of multi-path HMM with at most two parallel paths. Each path is trained using speech samples with short or long phoneme duration. The thresholds to divide samples of phonemes are ...
متن کامل